
Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters

Abstract

Bandwidth-starved multicore chips have become ubiquitous. It is well known that the performance of stencil codes can be improved by temporal blocking, lessening the pressure on the memory interface. We introduce a new pipelined approach that makes explicit use of shared caches in multicore environments and minimizes synchronization and boundary overhead. Benchmark results are presented for three current x86-based microprocessors, showing clearly that our optimization works best on designs with high-speed shared caches and low memory bandwidth per core. We furthermore demonstrate that simple bandwidth-based performance models are inaccurate for this kind of algorithm and employ a more elaborate, synthetic modeling procedure. Finally, we show that temporal blocking can be employed successfully in a hybrid shared/distributed-memory environment, albeit with limited benefit at strong scaling.


Bibliographic record

Authors: Wittmann, Markus; Hager, Georg; Treibig, Jan; Wellein, Gerhard
Year: 2010
Original format: PDF
Language: English

Similar documents


1. Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters [J]. Markus Wittmann, Georg Hager, Jan Treibig, Gerhard Wellein. Parallel Processing Letters, 2010, issue 4.
2. Multi-level spatial and temporal tiling for efficient HPC stencil computation on many-core processors with large shared caches [J]. Charles Yount, Alejandro Duran, Josh Tobin. Future Generation Computer Systems, 2019, issue MARa.
3. Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory [C]. Wittmann M., Hager G., Wellein G. 2010 IEEE International Symposium on Parallel Distributed Processing, Workshops and Phd Forum, 2010.
4. Auto-tuning stencil codes for cache-based multicore platforms [D]. Datta, Kaushik. 2009.
5. Spatio-temporal source cluster analysis reveals fronto-temporal auditory change processing differences within a shared autistic and schizotypal trait phenotype [O]. Talitha C. Ford, Will Woods, David P. Crewther. 2017.
6. Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory [O]. Markus Wittmann, Georg Hager, Gerhard Wellein. 2016.
